Time-series Anomaly Detection based on Difference Subspace between Signal Subspaces
This paper proposes a new method for anomaly detection in time-series data by
incorporating the concept of difference subspace into the singular spectrum
analysis (SSA). The key idea is to monitor slight temporal variations of the
difference subspace between two signal subspaces corresponding to the past and
present time-series data, as an anomaly score. It is a natural generalization of
the conventional SSA-based method, which measures the minimum angle between the
two signal subspaces as the degree of change. By replacing the minimum angle
with the difference subspace, our method improves performance within the same
SSA-based framework, because it captures the whole structural difference between
the two subspaces in both magnitude and direction. We demonstrate our method's
effectiveness through performance evaluations on public time-series datasets.
Comment: 8 pages, an acknowledgement was added to v
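The contrast between the minimum-angle score and a difference-subspace score can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the `lag` and `rank` parameters, and the specific difference-subspace magnitude (the sum of squared distances between canonical vector pairs, 2(1 - cos θ_i)) are assumptions chosen for the example, since the abstract does not state the exact score.

```python
import numpy as np

def signal_subspace(window, lag, rank):
    """Orthonormal basis of the SSA signal subspace of a 1-D window."""
    # Trajectory (Hankel) matrix: each column is a lagged sub-sequence.
    cols = len(window) - lag + 1
    traj = np.column_stack([window[i:i + lag] for i in range(cols)])
    u, _, _ = np.linalg.svd(traj, full_matrices=False)
    return u[:, :rank]

def canonical_cosines(u_past, u_present):
    """Cosines of the canonical angles between two subspaces, largest first."""
    s = np.linalg.svd(u_past.T @ u_present, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def min_angle_score(u_past, u_present):
    """Conventional score: 1 - cos^2 of the smallest canonical angle."""
    return float(1.0 - canonical_cosines(u_past, u_present)[0] ** 2)

def difference_subspace_score(u_past, u_present):
    """Assumed magnitude of the difference subspace.

    Sums ||u_i - v_i||^2 = 2 * (1 - cos(theta_i)) over all canonical
    vector pairs, so every canonical angle contributes, not only the
    smallest one.
    """
    cosines = canonical_cosines(u_past, u_present)
    return float(np.sum(2.0 * (1.0 - cosines)))
```

In this sketch, two windows taken from the same sinusoid give scores near zero, while a change in frequency between the past and present windows raises both scores.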
Adaptive occlusion sensitivity analysis for visually explaining video recognition networks
This paper proposes a method for visually explaining the decision-making
process of video recognition networks with a temporal extension of occlusion
sensitivity analysis, called Adaptive Occlusion Sensitivity Analysis (AOSA).
The key idea is to occlude a specific volume of data with a 3D mask in the
input 3D temporal-spatial data space and then measure the degree of change in
the output score. An occluded volume that produces a larger change is
regarded as a more critical element for classification. However, while the
occlusion sensitivity analysis is commonly used to analyze single image
classification, applying this idea to video classification is not so
straightforward as a simple fixed cuboid cannot deal with complicated motions.
To solve this issue, we adaptively set the shape of a 3D occlusion mask while
referring to motions. Our flexible mask adaptation is performed by considering
the temporal continuity and spatial co-occurrence of the optical flows
extracted from the input video data. We further propose a novel method to
reduce the computational cost of the proposed method with the first-order
approximation of the output score with respect to an input video. We
demonstrate the effectiveness of our method through various and extensive
comparisons with the conventional methods in terms of the deletion/insertion
metric and the pointing metric on the UCF101 dataset and the Kinetics-400 and
700 datasets.
Comment: 11 page
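For orientation, the fixed-cuboid baseline that the abstract describes as insufficient for complicated motions can be sketched as below. This is not AOSA: the optical-flow-based mask adaptation and the first-order approximation are not implemented, and `occlusion_sensitivity`, `score_fn`, `mask_shape`, and `fill` are hypothetical names introduced for the example. `score_fn` stands in for a video recognition network's class score.

```python
import numpy as np

def occlusion_sensitivity(video, score_fn, mask_shape, fill=0.0):
    """Score drop caused by occluding each fixed cuboid of a (T, H, W) video.

    Returns an array of the same shape as `video` in which every voxel
    holds the score change observed when its cuboid was occluded.
    """
    base = score_fn(video)
    dt, dh, dw = mask_shape
    t, h, w = video.shape
    heat = np.zeros(video.shape, dtype=float)
    # Slide a non-overlapping cuboid over time, height, and width.
    for t0 in range(0, t, dt):
        for y0 in range(0, h, dh):
            for x0 in range(0, w, dw):
                occluded = video.copy()
                occluded[t0:t0 + dt, y0:y0 + dh, x0:x0 + dw] = fill
                drop = base - score_fn(occluded)
                heat[t0:t0 + dt, y0:y0 + dh, x0:x0 + dw] = drop
    return heat
```

Each cuboid requires one forward pass of `score_fn`, which is the cost that the abstract's first-order approximation is meant to reduce.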
Discriminant feature extraction by generalized difference subspace
This paper reveals the discriminant ability of the orthogonal projection of data onto a generalized difference subspace (GDS), both theoretically and experimentally. In our previous work, we demonstrated that GDS projection works as a quasi-orthogonalization of class subspaces. Interestingly, GDS projection also works as discriminant feature extraction through a mechanism similar to Fisher discriminant analysis (FDA). A direct proof of the connection between GDS projection and FDA is difficult because their formulations differ significantly. To avoid this difficulty, we first introduce geometrical Fisher discriminant analysis (gFDA), based on a simplified Fisher criterion. gFDA works stably even with few samples, bypassing the small sample size (SSS) problem of FDA. Next, we prove that gFDA is equivalent to GDS projection with a small correction term. This equivalence ensures that GDS projection inherits the discriminant ability of FDA via gFDA. Furthermore, we discuss two useful extensions of these methods: (1) a nonlinear extension by the kernel trick, and (2) a combination with convolutional neural network (CNN) features. The equivalence and the effectiveness of the extensions have been verified through extensive experiments on the extended Yale B+, CMU face database, ALOI, ETH80, MNIST and CIFAR10, focusing on the SSS problem.
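A common way to construct a GDS is to sum the projection matrices of the class subspaces and discard the leading eigenvectors of that sum. The sketch below follows that construction under stated assumptions: `gds_basis`, `n_remove`, and the tolerance `tol` are hypothetical names and values, and the gFDA correction term mentioned in the abstract is not included.

```python
import numpy as np

def gds_basis(class_bases, n_remove, tol=1e-10):
    """Orthonormal basis of a generalized difference subspace.

    class_bases: list of (dim, k_c) arrays with orthonormal columns,
                 one per class subspace.
    n_remove:    number of leading eigenvectors (the principal component
                 subspace shared by the classes) to discard.
    """
    # Sum of the orthogonal projection matrices of all class subspaces.
    g = sum(u @ u.T for u in class_bases)
    eigvals, eigvecs = np.linalg.eigh(g)
    # eigh returns ascending eigenvalues; reorder to descending.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep directions inside the sum subspace, then drop the leading ones.
    inside = eigvals > tol
    return eigvecs[:, inside][:, n_remove:]

def gds_project(basis, x):
    """Coordinates of x after orthogonal projection onto the GDS."""
    return basis.T @ x
```

For two one-dimensional class subspaces separated by 60 degrees, removing the single leading eigenvector leaves the direction along the difference of the two class vectors, and their projections onto it point in opposite directions.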
Resolving Marker Pose Ambiguity by Robust Rotation Averaging with Clique Constraints
Planar markers are useful in robotics and computer vision for mapping and
localisation. Given a detected marker in an image, a frequent task is to
estimate the 6DOF pose of the marker relative to the camera, which is an
instance of planar pose estimation (PPE). Although there are mature techniques,
PPE suffers from a fundamental ambiguity problem, in that there can be more
than one plausible pose solution for a PPE instance. Especially when
localisation of the marker corners is noisy, it is often difficult to
disambiguate the pose solutions based on reprojection error alone. Previous
methods choose between the possible solutions using heuristic criteria, or
simply ignore ambiguous markers.
We propose to resolve the ambiguities by examining the consistencies of a set
of markers across multiple views. Our specific contributions include a novel
rotation averaging formulation that incorporates long-range dependencies
between possible marker orientation solutions that arise from PPE ambiguities.
We analyse the combinatorial complexity of the problem, and develop a novel
lifted algorithm to effectively resolve marker pose ambiguities, without
discarding any marker observations. Results on real and synthetic data show
that our method is able to handle highly ambiguous inputs, and provides more
accurate and/or complete marker-based mapping and localisation.
Comment: 7 pages, 4 figures, 4 table
Temporal-stochastic tensor features for action recognition
In this paper, we propose Temporal-Stochastic Product Grassmann Manifold (TS-PGM), an efficient method for tensor classification in tasks such as gesture and action recognition. Our approach builds on the idea of representing tensors as points on a Product Grassmann Manifold (PGM). This is achieved by mapping tensor modes to linear subspaces, where each subspace can be seen as a point on the Grassmann Manifold (GM) of the corresponding mode. The factor manifolds of the respective modes can then be unified in a natural way via PGM. However, this approach may discard discriminative information because it treats all modes equally and does not consider the nature of temporal tensors such as videos. Therefore, we introduce Temporal-Stochastic Tensor features (TST features) to extract temporal information from tensors and encode it in a sequence-preserving TST subspace. These features and the regular tensor modes can then be used simultaneously on PGM. Our framework addresses the classification of temporal tensors while inheriting the unified mathematical interpretation of PGM, because the TST subspace can be naturally integrated into PGM as a new factor manifold. Additionally, we enhance our method in two ways: (1) we improve the discrimination ability by projecting subspaces onto a Generalized Difference Subspace, and (2) we utilize kernel mapping to construct kernelized subspaces that can handle nonlinear data distributions. Experimental results on gesture and action recognition datasets show that our methods based on subspace representation with explicit TST features outperform pure spatio-temporal approaches.
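The plain PGM representation that TS-PGM builds on can be sketched as below: each tensor mode is unfolded into a matrix, its leading left singular vectors give a point on that mode's Grassmann manifold, and factor distances are combined into a product distance. This is a minimal sketch under assumptions, not the paper's method: the TST features are not implemented, the names `mode_subspaces` and `pgm_distance` are hypothetical, and the chordal (projection) distance is only one possible choice of factor distance.

```python
import numpy as np

def mode_subspaces(tensor, rank):
    """One orthonormal basis per tensor mode, from the mode-n unfolding."""
    bases = []
    for mode in range(tensor.ndim):
        # Mode-n unfolding: bring the mode to the front, flatten the rest.
        unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
        bases.append(u[:, :rank])
    return bases

def pgm_distance(bases_a, bases_b):
    """Product distance: root of the summed squared chordal distances.

    For each mode, the squared chordal distance is the sum of
    sin^2(theta_i) over the canonical angles between the two subspaces.
    """
    total = 0.0
    for ua, ub in zip(bases_a, bases_b):
        cosines = np.linalg.svd(ua.T @ ub, compute_uv=False)
        cosines = np.clip(cosines, 0.0, 1.0)
        total += float(np.sum(1.0 - cosines ** 2))
    return float(np.sqrt(total))
```

Because every mode contributes with equal weight here, the sketch also shows the limitation the abstract points out: nothing in it treats the temporal mode differently from the spatial ones.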